21 research outputs found
Details of Deformable Part Models for Automatically Georeferencing Historical Map Images
Libraries are digitizing their collections of maps from all eras, generating increasingly large online collections of historical cartographic resources. Aligning such maps to a modern geographic coordinate system greatly increases their utility. This work presents a method for such automatic georeferencing, matching raster image content to GIS vector coordinate data. Given an approximate initial alignment that has already been projected from a spherical geographic coordinate system to a Cartesian map coordinate system, a probabilistic shape-matching scheme determines an optimized match between the GIS contours and ink in the binarized map image. Us- ing an evaluation set of 20 historical maps from states and regions of the U.S., the method reduces average alignment RMSE by 12%
Deformable Part Models for Automatically Georeferencing Historical Map Images
Libraries are digitizing their collections of maps from all eras, generating increasingly large online collections of historical cartographic resources. Aligning such maps to a modern geographic coordinate system greatly increases their utility. This work presents a method for such automatic georeferencing, matching raster image content to GIS vector coordinate data. Given an approximate initial alignment that has already been projected from a spherical geographic coordinate system to a Cartesian map coordinate system, a probabilistic shape-matching scheme determines an optimized match between the GIS contours and ink in the binarized map image. Using an evaluation set of 20 historical maps from states and regions of the U.S., the method reduces average alignment RMSE by 12%
Techniques and applications for persistent backgrounding in a humanoid torso robot
Abstract-One of the most basic capabilities for an agent with a vision system is to recognize its own surroundings. Yet surprisingly, despite the ease of doing so, many robots store little or no record of their own visual surroundings. This paper explores the utility of keeping the simplest possible persistent record of the environment of a stationary torso robot, in the form of a collection of images captured from various pan-tilt angles around the robot. We demonstrate that this particularly simple process of storing background images can be useful for a variety of tasks, and can relieve the system designer of certain requirements as well. We explore three uses for such a record: auto-calibration, novel object detection with a moving camera, and developing attentional saliency maps
Recommended from our members
SIGN DETECTION IN NATURAL IMAGES WITH CONDITIONAL RANDOM FIELDS
Traditional generative Markov random elds for seg- menting images model the image data and corresponding labels jointly, which requires extensive independence assumptions for tract- ability. We present the conditional random eld for an application in sign detection, using typical scale and orientation selective tex- ture lters and a nonlinear texture operator based on the grating cell. The resulting model captures dependencies between neighbor- ing image region labels in a data-dependent way that escapes the dicult problem of modeling image formation, instead focusing ef- fort and computation on the labeling task. We compare the results of training the model with pseudo-likelihood against an approxima- tion of the full likelihood with the iterative tree reparameterization algorithm and demonstrate improvement over previous methods
Recommended from our members
Unified detection and recognition for reading text in scene images
Although an automated reader for the blind first appeared nearly two-hundred years ago, computers can currently read document text about as well as a seven-year-old. Scene text recognition brings many new challenges. A central limitation of current approaches is a feed-forward, bottom-up, pipelined architecture that isolates the many tasks and information involved in reading. The result is a system that commits errors from which it cannot recover and has components that lack access to relevant information. We propose a system for scene text reading that in its design, training, and operation is more integrated. First, we present a simple contextual model for text detection that is ignorant of any recognition. Through the use of special features and data context, this model performs well on the detection task, but limitations remain due to the lack of interpretation. We then introduce a recognition model that integrates several information sources, including font consistency and a lexicon, and compare it to approaches using pipelined architectures with similar information. Next we examine a more unified detection and recognition framework where features are selected based on the joint task of detection and recognition, rather than each task individually. This approach yields better results with fewer features. Finally, we demonstrate a model that incorporates segmentation and recognition at both the character and word levels. Text with difficult layouts and low resolution are more accurately recognized by this integrated approach. By more tightly coupling several aspects of detection and recognition, we hope to establish a new unified way of approaching the problem that will lead to improved performance. We would like computers to become accomplished grammar-school level readers
Recommended from our members
Improving Recognition of Novel Input with Similarity
Many sources of information relevant to computer vision and machine learning tasks are often underused. One example is the similarity between the elements from a novel source, such as a speaker, writer, or printed font. By comparing instances emitted by a source, we help ensure that similar instances are given the same label. Previous approaches have clustered instances prior to recognition. We propose a probabilistic framework that unifies similarity with prior identity and contextual information. By fusing information sources in a single model, we eliminate unrecoverable errors that result from processing the information in separate stages and improve overall accuracy. The framework also naturally integrates dissimilarity information, which has previously been ignored. We demonstrate with an application in printed character recognition from images of signs in natural scenes